********************************************************************************
**  TITLE:    CA97_results                                                    **
**  AUTHOR:   Philippe Mongrain                                               **
**  DATA:     cand                                                            **
**  VERSION:  Stata 16                                                        **
**  DATE:     October 2022                                                    **
********************************************************************************

* Version control

version 16.0

* Open log file

capture log close
log using "CA97_results.smcl", replace

* Download the "cand.txt" file from the link below and save it in the working directory:
* https://www.elections.ca/content.aspx?section=res&document=candexe&dir=rep/off/37p&lang=e

* Import the dataset

import delimited cand.txt, delimiters(tab) clear

* Keep 1997 election results

keep if event_number == 3600
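
* Optional check (a sketch, not part of the original workflow): stop early if the
* filter kept no rows. It assumes event_number was imported as numeric, which the
* comparison above already requires.
assert _N > 0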

* Generate party variable

gen party = .

replace party = 1 if candidate_party_english_name == "Liberal"
replace party = 2 if candidate_party_english_name == "Bloc Québécois"
replace party = 3 if candidate_party_english_name == "Reform" | candidate_party_english_name == "Canadian Alliance"
replace party = 4 if candidate_party_english_name == "Progressive Conservative"
replace party = 5 if candidate_party_english_name == "N.D.P."
replace party = 97 if party == .
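
* Optional check (a sketch): list the raw party names that fell into the residual
* category (97). This helps catch spelling or encoding mismatches, for example in
* the accented characters of "Bloc Québécois".
tabulate candidate_party_english_name if party == 97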

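* Generate rank of candidates within each district (1 = most votes)
* The within-district order is set explicitly inside -bysort- rather than relying
* on an earlier sort being preserved. This assumes candidate_vote is numeric.

gen negvote97 = -candidate_vote
bysort ed_code (negvote97) : gen rank97 = _n
drop negvote97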

gen first97 = party if rank97 == 1
gen second97 = party if rank97 == 2
gen third97 = party if rank97 == 3

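* Sorting on rank97 within district makes the [1], [2], [3] subscripts deterministic
bysort ed_code (rank97) : gen winner97 = first97[1]
bysort ed_code (rank97) : gen runnerup97 = second97[2]
bysort ed_code (rank97) : gen thirdplace97 = third97[3]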

* Generate margins of victory

* Margin = vote share of the winner minus vote share of the runner-up (percentage points)
bysort ed_code (rank97) : gen margin97 = candidate_vote_percentage[1] - candidate_vote_percentage[2]
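
* Optional check (a sketch): with candidates ordered by descending votes, the margin
* should never be negative. It assumes the reported percentages are consistent with
* the vote counts. Districts with a single candidate have a missing margin.
assert margin97 >= 0 if !missing(margin97)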

* Generate effective number of electoral parties

gen sqrvotepr97 = (candidate_vote_percentage/100)^2

* total() is the documented egen function; sum() is an older synonym
bysort ed_code : egen sumsqrvotepr97 = total(sqrvotepr97)

gen enep97 = 1/sumsqrvotepr97
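
* Illustration of the formula above, using hypothetical shares (not from the data):
* ENEP = 1 / (sum of squared vote shares). With shares of 0.5, 0.3, and 0.2, the sum
* of squares is 0.25 + 0.09 + 0.04 = 0.38, so ENEP = 1/0.38, or about 2.63.
display 1/(0.5^2 + 0.3^2 + 0.2^2)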

* Labelling parties

label define party 1 "Liberal Party" 2 "Bloc Québécois" 3 "Reform/Canadian Alliance" 4 "Progressive Conservative" 5 "NDP" 97 "Other"
* The last name is the value label; the names before it are the variables that receive it
label values party winner97 runnerup97 thirdplace97 party

* Clean district codes: strip hyphens and the "ED" marker (ed_code remains a string)

replace ed_code = subinstr(ed_code, "-", "", .)
replace ed_code = subinstr(ed_code, "ED", "", .)

* Selecting, ordering, and sorting variables

keep ed_code party rank97 candidate_vote candidate_vote_percentage margin97 enep97 winner97 runnerup97 thirdplace97

order ed_code party rank97 candidate_vote candidate_vote_percentage margin97 enep97 winner97 runnerup97 thirdplace97

sort ed_code rank97

drop if rank97 > 3

* Keep one row per district: the winner's row
* (party, rank97, candidate_vote, and candidate_vote_percentage then refer to the winner)

keep if rank97 == 1
isid ed_code

* Create election variable

gen election97 = "1997 Canadian federal election"

* Saving file

save CA97_results, replace

log close